DOCUMENT RESUME

ED 466 269                                                  JC 020 476

AUTHOR          Bahr, Peter Riley
TITLE           Student Average Academic Preparation: The Development of a
                College-Level Summary Measure of Student Preparedness for
                Academic Coursework.
INSTITUTION     California Community Colleges, Sacramento. Office of the
                Chancellor.
PUB DATE        2002-05-03
NOTE            12p.; Paper presented at the Annual Meeting of the Research
                and Planning Group for California Community Colleges (40th,
                Pacific Grove, CA, May 1-3, 2002).
AVAILABLE FROM  For full text:
                http://www.cccco.edu/divisions/tris/rp/rp_doc/rp2002__saap.pdf.
PUB TYPE        Reports - Evaluative (142) -- Speeches/Meeting Papers (150)
EDRS PRICE      MF01/PC01 Plus Postage.
DESCRIPTORS     *College Outcomes Assessment; *College Preparation;
                *Community Colleges; Comparative Analysis; Educational
                Environment; *Performance Based Assessment; Research
                Methodology; Two Year Colleges
IDENTIFIERS     *California Community Colleges; Stanford Achievement Tests



ABSTRACT 



The Board of Governors of California's Community College 
System, in executing California's Partnership for Excellence (PFE) Program, 
has recognized that the colleges operate within remarkably disparate social 
and economic environments, and that these differences include variation in 
factors that are likely to affect the performance of colleges on the 
predetermined outcome measures. The attempt to compensate for such 
disparities has resulted in an adjustment modeling process. Adjustment models 
are statistically derived equations that "adjust" for observed relationships 
between external variables and each of the PFE outcomes. This document 
describes the process the Chancellor's Office used to match college 
enrollment records against standard test results without the use of social 
security numbers or other unique identifiers. The subsequent Student Average 
Academic Preparation (SAAP) measure proved to be a significant and positive 
adjustment factor in three of the five PFE adjustment models, including basic 
skills improvement, course completion, and transfer. According to the paper, 
the SAAP represents a substantial step forward in the systemwide efforts to 
account for the disparate conditions affecting the performance of each of 
California's community colleges. (EMH) 

ABSTRACT 

California’s community college performance based funding strategy - the Partnership for 
Excellence program - includes, as an aspect of the outcome assessment component of the 
program, a mechanism to “level the playing field” between colleges. This function is 
accomplished through adjustment models: statistically derived equations that “adjust” for 
observed relationships between exogenous variables and college-level outcomes of interest. The 
development of adjustment models for each of the several outcomes has relied upon an 
exploratory process to derive a parsimonious set of exogenous variables with nonzero 
(statistically significant) relationships to the outcome of interest. One previously unmeasured 
adjustment variable has received considerable interest in discussions of the adjustment model 
developmental process, namely the academic preparedness of entering students at each college. 
This paper addresses the recent work of the California Community Colleges Chancellor’s Office 
to develop a measure of student average academic preparation for use in the “leveling the 
playing field” aspect of community college outcome measurement and accountability. 

Student Average Academic Preparation: 

The Development of a College-Level Summary Measure 
of Student Preparedness for Academic Coursework 



BACKGROUND 

The Board of Governors of California’s Community College system, in executing California’s 
Partnership for Excellence (PFE) program, has recognized formally that the colleges operate 
within remarkably disparate social and economic environments, and that these differences 
include variation in factors that are likely to affect the performance of colleges on the 
predetermined outcome measures. The recognition of, and attempt to correct for, such disparities 
has taken the form of the implementation of an “adjustment modeling” process. Adjustment 
models are statistically derived equations that “adjust” for observed relationships between 
exogenous variables (factors that are not within the purview of control of the individual colleges 
and districts) and each of the PFE outcomes. 



THE PROBLEM 

While the exploratory process of selecting a parsimonious set of adjustment variables has drawn 
on numerous data sources and examined a variety of possible adjustment factors, considerable 
attention has been focused on one important adjustment factor for which data were unavailable. 
This factor, dubbed student average academic preparation (SAAP), would represent the relative 
academic preparedness of entering students at each college. The academic preparedness of the 
incoming student population was expected to be a predominant factor affecting the performance 
of each college on accountability measures derived from student outcomes. 



THE ANSWER 

The Chancellor’s Office of the California Community College system, in an effort to adjust for 
the effects of differences in student preparedness on the performance of each college, forged a 
data sharing alliance with the California Department of Education (CDE). In 1998, CDE 
implemented statewide testing of public school students using the Stanford 9 test battery as one 
component of California’s Standardized Testing and Reporting (STAR) program. The Stanford 
9 test includes five subject areas (mathematics, reading, language, history/social science, and 
science) and is one measure employed by CDE to assign an Academic Performance Index (API) 
score to each public school in California. 

CDE agreed to share with the Chancellor’s Office the Stanford 9 test results for public high 
school juniors for the two terms for which data were available at the time of the initiation of the 
data sharing: spring semester 1998 and spring semester 1999. The intention of the Chancellor’s 
Office was to cross reference the Stanford 9 test results with the Fall 2000 cohort of incoming 
first-time freshmen at each college, and, by calculating the mean of each cohort’s Stanford 9 test 
results, develop an index of the average academic preparation of incoming freshmen at each 
college. 



A PLOT TWIST 

Unfortunately, CDE does not have social security numbers or other relevant unique identifiers 
for students’ Stanford 9 test results, precluding a unique match against Chancellor’s Office 
records. Thus, a simple match against college enrollment records was not possible. 



A “FUZZY” SOLUTION 

To remedy this problem, the Chancellor’s Office developed a “fuzzy match” process to connect 
student records with the CDE Stanford 9 test results. The fuzzy match relied upon the combined 
uniqueness of multiple student-level descriptive variables to connect student records across the 
two datasets. The four variables selected for the fuzzy match were gender, birth date, 
race/ethnicity, and high school of origin (in the case of the Stanford 9 data, high school of 
enrollment at the time of test administration). 

Further complicating the situation, the Stanford 9 data includes both a primary race/ethnicity 
variable and a secondary variable indicating one or more additional racial/ethnic identifications, 
while the Chancellor’s Office data includes only a single racial/ethnic identification variable. A 
fuzzy match employing only the primary race/ethnicity variable in both datasets would have 
been a reasonable method of matching records. However, with the goal of using all available 
information to maximize the percentage of matched records across the two datasets, the 
Chancellor’s Office expanded the matching process to capitalize on the data contained within 
this secondary race/ethnicity variable. 

In simple terms, the matching process involved five stages: 

1. First-time freshmen from the Fall semester/quarter of 2000 were identified. 

2. This first-time freshmen cohort was screened to eliminate all students who were younger 
than 17 years of age at first enrollment, older than 22 years of age at first enrollment, or 
who did not specify a valid California high school as their high school of origin. 

3. The student records of this reduced cohort were then matched against the Stanford 9 test 
data (1998 and 1999 combined) using the combination of four variables mentioned 
above: birth date, high school, gender and primary race/ethnicity. 

4. The matched records from the previous step were set aside, and the remaining unmatched 
first-time freshmen were matched against the Stanford 9 test data using the same four 
variables with the exception that the secondary race/ethnicity variable in the Stanford 9 
data was used instead of the primary race/ethnicity variable. However, this step of the 
matching process included only Stanford 9 records with a single racial/ethnic 
identification in the secondary race/ethnicity variable. In other words, Stanford 9 test 
takers had the option of coding multiple racial/ethnic identifications in the secondary 
race/ethnicity variable, and all students who did so were eliminated from the matching 
process accomplished in this step. 

5. The matched student records from steps 3 and 4 were combined. 
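
The five stages above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the paper does not describe the file layouts or the software used, so the field names (birth_date, high_school, gender, race, primary_race, secondary_races, age, student_id) and the helper functions are hypothetical.

```python
from collections import defaultdict

# Hypothetical field names; the paper does not describe the actual file layouts.
KEY_FIELDS = ("birth_date", "high_school", "gender")


def screen(freshmen, valid_high_schools):
    """Stage 2: keep students aged 17 to 22 from a valid California high school."""
    return [
        s for s in freshmen
        if 17 <= s["age"] <= 22 and s["high_school"] in valid_high_schools
    ]


def build_index(test_records, race_of):
    """Group test records by (birth date, high school, gender, race)."""
    index = defaultdict(list)
    for rec in test_records:
        race = race_of(rec)
        if race is not None:
            key = tuple(rec[f] for f in KEY_FIELDS) + (race,)
            index[key].append(rec)
    return index


def fuzzy_match(freshmen, test_records):
    """Stages 3 to 5: match on primary race first, then on a single secondary race."""
    primary = build_index(test_records, lambda r: r["primary_race"])
    # Stage 4 uses only records with exactly one secondary identification.
    secondary = build_index(
        test_records,
        lambda r: r["secondary_races"][0] if len(r["secondary_races"]) == 1 else None,
    )
    matches = {}
    for s in freshmen:
        key = tuple(s[f] for f in KEY_FIELDS) + (s["race"],)
        # The secondary index is consulted only when the primary match fails.
        found = primary.get(key) or secondary.get(key)
        if found:
            matches[s["student_id"]] = list(found)
    return matches
```

The result maps each matched student to a list of candidate test records; a list with more than one entry corresponds to a multiply-matched student.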



DEDUPLICATING MULTIPLY-MATCHED STUDENT RECORDS 

Despite the relatively unique combination of birth date, high school, gender, and race/ethnicity, a 
number of duplicate observations were generated during the matching process. Duplicate 
observations constitute a single community college student for whom multiple Stanford 9 records 
were matched on the basis of the four variables discussed above. One would expect that 
duplicate observations would be particularly problematic in colleges that draw students from 
relatively few high schools containing populations that are relatively homogenous with regard to 
race/ethnicity. 

Because the Stanford 9 test data lacks a unique identifier, it is impossible to determine which of 
the several Stanford 9 records were matched correctly to a given freshman college student. 
Duplicate matches must be either eliminated entirely (dropping all students for whom multiple 
matches occurred) or eliminated at random such that only a single test score match remains. The 
Chancellor’s Office elected for the latter of these two options - eliminating all but the first 
occurrence of multiply-matched student records - in order to maximize the match rate for each 
college. Furthermore, because the original CDE data were not sorted by test score prior to the 
matching process, the elimination of all but the first match of multiply matched student records 
(as opposed to eliminating all but the second match, etc.) is not believed to result in any 
systematic change, by college, in average Stanford 9 test scores. 
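
The two deduplication options can be contrasted with a minimal sketch. It assumes, hypothetically, that the matching step yields a dictionary from student identifier to the list of matched test records in their original file order; the function names are illustrative only.

```python
def keep_first_match(matches):
    """Option chosen in the paper: keep only the first matched record per student."""
    return {sid: recs[0] for sid, recs in matches.items() if recs}


def drop_multiply_matched(matches):
    """Alternative option: discard every student with more than one matched record."""
    return {sid: recs[0] for sid, recs in matches.items() if len(recs) == 1}
```

Keeping the first record retains every matched student, which is why it yields the higher match rate; it avoids bias only to the extent that file order is unrelated to test score, as the text argues.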



ASSESSING THE REPRESENTATIVENESS OF THE MATCHED GROUP 

Prior to matching, the Chancellor’s Office identified 103,929 unique student records meeting the 
criteria of: (1) first term of enrollment in Summer 2000 or Fall 2000, (2) originating from a 
California high school, and (3) age of no more than 22 years and no less than 17 years. After 
completing the fuzzy match process and deduplicating matched records, Stanford 9 test scores 
were matched to 58,565 students for an overall match rate of 56.35%. Of these, 53,955 students 
had valid test scores on all five tests, leading to a practical match rate of 51.92%. Match rates by 
gender and race/ethnicity are provided below in Tables 1 and 2, respectively. The findings 
presented in these tables suggest reasonably equal match rates across categories of gender and 
predominant categories of race/ethnicity. 
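
The reported rates can be reproduced from the counts given in this section. The short check below uses only figures stated in the text and in Table 1; the variable and function names are illustrative.

```python
COHORT = 103_929      # first-time freshmen meeting the three criteria
MATCHED = 58_565      # students with a matched Stanford 9 record
FIVE_VALID = 53_955   # matched students with valid scores on all five tests


def pct(part, whole):
    """Percentage rounded to two decimals, as reported in the paper."""
    return round(100 * part / whole, 2)


overall_rate = pct(MATCHED, COHORT)        # 56.35
practical_rate = pct(FIVE_VALID, COHORT)   # 51.92

# Table 1: (N, percent with five valid test scores) by gender.
GENDER = {"Male": (50_418, 51.57), "Female": (53_014, 52.73), "Nonreporting": (497, 0.00)}
gender_total = sum(n for n, _ in GENDER.values())
implied_five_valid = sum(n * p / 100 for n, p in GENDER.values())
```

The gender counts sum to the full cohort, and the count of students with five valid scores implied by the gender percentages agrees with 53,955 to within rounding.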

Match rates by college varied from a low of 6.57% to a high of 68.33%. Excluding the lowest 
match rate of 6.57%, the next lowest college match rate was 13.17%, followed by 25.00%. 
Descriptive statistics for match rate by college are provided in Table 3, and a histogram 
representing the distribution (proportion) of colleges by match rate is presented in Figure 1. 
While not equal across colleges, match rates appear to be sufficiently large at most of the 
colleges to warrant a reasonable degree of confidence in aggregate statistics derived from the 
match. 

TABLE 1: Percent match by gender for all first-time freshmen, ages 17 to 22, originating 
in California high schools 

Gender              N for all First-Time Freshmen    Percent with Five
                    Ages 17 to 22 Originating in     Valid Test Scores
                    California High Schools

Male                           50,418                      51.57
Female                         53,014                      52.73
Nonreporting                      497                       0.00



TABLE 2: Percent match by race/ethnicity for all first-time freshmen, ages 17 to 22, 
originating in California high schools 

Race/Ethnicity      N for all First-Time Freshmen    Percent with Five
                    Ages 17 to 22 Originating in     Valid Test Scores
                    California High Schools

White                          40,977                      58.85
Black                           6,470                      44.13
Hispanic                       33,189                      55.86
Asian                          10,492                      54.96
Pacific Islander                  782                      26.73
Filipino                        3,975                      49.96
Native American                   960                      22.60
Other                           2,263                       9.59
Nonreporting                    4,821                       1.10



TABLE 3: Descriptive statistics for percent match by college 

Mean                     48.65
Standard Deviation       11.16
Median                   49.50
25th Percentile          42.69
75th Percentile          56.90
Minimum                   6.57
Maximum                  68.33
Skewness                 -0.82
Kurtosis                  4.34
N                          106

FIGURE 1: Histogram of the distribution (proportion) of colleges by match rate (N=106) 
[Figure not reproduced. Axis label: Proportion of first-time freshmen for whom a match was obtained] 



CALCULATING THE COLLEGE-LEVEL SAAP SCORE 



The SAAP score for each college is a simple, unweighted mean of the average of each student’s 
five normal curve equivalent test scores. Stated briefly, CDE provided the nationally 
standardized normal curve equivalent score (a percentile) for each of the five tests for each 
student. The Chancellor’s Office averaged these normal curve equivalent scores across each 
student, with equal weights given to each of the five tests. The averages of the five tests for the 
students were then collapsed (averaged) to the level of the college to give a summary mean of 
the means of the five tests for matched students. Descriptive statistics for the average of the five 
normal curve equivalent scores for all students are provided in Table 4, and descriptive statistics 
for the college-level SAAP score are provided in Table 5. The distribution (proportion) of 
colleges by SAAP score is presented in Figure 2. 
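
The two-step averaging described above can be written out as a short sketch. The input layout, pairs of a college identifier and a list of five scores, is a hypothetical convenience and not the format of the CDE or Chancellor’s Office data.

```python
from statistics import mean


def student_average(nce_scores):
    """Equal-weight average of one student's five normal curve equivalent scores."""
    if len(nce_scores) != 5:
        raise ValueError("expected valid scores on all five tests")
    return mean(nce_scores)


def saap_by_college(students):
    """Unweighted mean, by college, of the matched students' five-test averages.

    `students` is an iterable of (college, [five scores]) pairs.
    """
    averages = {}
    for college, scores in students:
        averages.setdefault(college, []).append(student_average(scores))
    return {college: mean(values) for college, values in averages.items()}
```

For example, a college with two matched students whose five-test averages are 50 and 40 would receive a SAAP score of 45.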

A review of Table 4, Table 5, and Figure 2 reveals a bell-shaped distribution for the 
aggregate SAAP score and a fairly low degree of variation relative to the variation in student-level 
scores. For example, while the interquartile range (IQR) of student-level scores is 22.94, the 
aggregate SAAP has an IQR of only 7.94. A comparison of standard deviations reveals a 
similarly reduced level of variation in the SAAP (15.78 versus 5.12). Means and medians are 
similar both within and between the student-level and college-level measures. 
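
The interquartile ranges quoted above follow directly from the quartiles reported in Tables 4 and 5, as the brief check below illustrates. The dictionary layout and function name are illustrative only.

```python
# Quartiles and standard deviations as reported in Tables 4 and 5.
student_level = {"q1": 36.62, "q3": 59.56, "sd": 15.78}
college_level = {"q1": 43.85, "q3": 51.79, "sd": 5.12}


def iqr(stats):
    """Interquartile range: 75th percentile minus 25th percentile."""
    return round(stats["q3"] - stats["q1"], 2)


student_iqr = iqr(student_level)   # 22.94
college_iqr = iqr(college_level)   # 7.94
sd_ratio = student_level["sd"] / college_level["sd"]
```

The student-level standard deviation is roughly three times the college-level one, which is the expected direction when individual scores are averaged within colleges.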



TABLE 4: Descriptive statistics for the average of the five Stanford 9 normal curve 
equivalent scores for all matched students 

Mean                     48.42
Standard Deviation       15.78
Median                   47.70
25th Percentile          36.62
75th Percentile          59.56
Minimum                    0.8
Maximum                   99.0
Skewness                  0.22
Kurtosis                  2.55
N                       53,955

TABLE 5: Descriptive statistics for the college-level Student Average Academic Preparation 
(SAAP) score 

Mean                     47.81
Standard Deviation        5.12
Median                   48.51
25th Percentile          43.85
75th Percentile          51.79
Minimum                  30.83
Maximum                  61.34
Skewness                 -0.36
Kurtosis                  3.26
N                          106



FIGURE 2: Histogram of the distribution (proportion) of colleges by Student Average 
Academic Preparation Score (N=106) 

THE USE OF THE SAAP MEASURE IN ADJUSTMENT MODELS 



The SAAP measure proved to be a statistically significant and positive adjustment factor in three 
of the five PFE adjustment models, including the basic skills improvement model, the course 
completion model, and the transfer model. Likewise, the measure was found to be statistically 
significant and positive in adjustment models developed by the Chancellor’s Office for the 
Persistently Low Transfer College (PLTC) study. The measure was not found to be statistically 
significant in the degree/certificate completion PFE adjustment model or the vocational course 
completion PFE adjustment model, after accounting for other adjustment factors. 



PROBLEMS AND FUTURE DIRECTIONS 

A number of unresolved weaknesses in the SAAP measure are immediately evident. First, the 
measure addresses only the academic preparation of recent high school students and fails to 
address the impact that the academic preparedness of “nontraditional” students and early high 
school “dropouts” may have on the performance of a college. Second, the measure addresses 
only students of California high schools, which is particularly problematic for community 
colleges near the borders of California where the influx of nonresident students could be 
expected to be relatively high. Third, the measure addresses only public high school students, 
excluding students of private high schools and home schools. Fourth, at present the measure is 
calculated for only one year (Fall 2000), although this problem will be remedied as additional 
waves of Stanford 9 data are made available by CDE. Fifth, the year for which the SAAP 
measure is calculated is several years after the baseline years addressed by the PFE and PLTC 
adjustment models, the consequence of which is the unverified assumption of relative continuity 
in the academic preparation of incoming college freshmen at each college. Finally, the rules of 
administration for the Stanford 9 test exclude certain segments of the public high school 
population, most notably students who have been enrolled in a given school district for less than 
one year, suggesting that transient populations may be underrepresented in the Stanford 9 data 
and the SAAP measure. 

Future work on the SAAP measure is expected to include an expansion in the number of 
first-time freshmen cohorts addressed by the measure, cross-year validation of the assumption of 
continuity in relative student average academic preparedness, and college-level validation of the 
representativeness of the match. 



CONCLUSION 

The SAAP measure developed by the Chancellor’s Office represents a substantial step forward 
in the systemwide efforts to account for the disparate exogenous conditions affecting the 
performance of each of California’s community colleges. While the measure is not without 
weaknesses, it still satisfies, at least in part, an essential and long-standing need voiced by 
numerous researchers and administrators in the California Community College system. 
Moreover, although the measure was constructed to meet the immediate requirements of 
adjustment modeling for the purpose of accountability, it has the potential to contribute to the 
advancement of many future research efforts aimed at understanding the dynamics of education 
in community colleges. However, with due consideration given to the temporal and budgetary 
restrictions present at the time of development, it is recognized that the measure is in its infancy 
and that improvements and refinements on the measure should and will continue. 